Abstract: Big data refers to the large and varied body of data that is being generated every day. In recent years, handling this data has become a major challenge. Hadoop is an open-source platform that is used effectively to run big data applications. Its two core components are MapReduce and the Hadoop Distributed File System (HDFS). HDFS is the storage layer, and MapReduce is the programming model used to process the stored data in parallel. For large data sets, this approach can produce results faster than traditional database operations. Pig and Hive are two higher-level languages that help developers write programs for the MapReduce framework in a short period of time.

 

Keywords: MapReduce, Pig, Hive, Big data, Hadoop, HDFS.